PropNet: a White-Box and Human-Like Network for Sentence Representation

Yang, Fei

arXiv.org Artificial Intelligence

Transformer-based embedding methods have dominated the field of sentence representation in recent years. Although they have achieved remarkable performance on NLP tasks such as semantic textual similarity (STS), their black-box nature and reliance on large-scale data-driven training have raised concerns about bias, trust, and safety. Many efforts have been made to improve the interpretability of embedding models, but these problems have not been fundamentally resolved. To achieve inherent interpretability, we propose PropNet, a purely white-box and human-like sentence representation network. Inspired by findings from cognitive science, PropNet constructs a hierarchical network based on the propositions contained in a sentence. While experiments indicate that PropNet still lags state-of-the-art (SOTA) embedding models on STS tasks, case studies reveal substantial room for improvement. Additionally, PropNet enables us to analyze and understand the human cognitive processes underlying STS benchmarks.
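
The abstract describes the approach only at a high level, but the appeal of proposition-based similarity can be shown with a purely hypothetical toy sketch (this is not PropNet's actual construction, which is hierarchical and more sophisticated): if sentences are compared by aligning proposition triples, every contribution to the final score traces back to a specific pair of propositions, which is what makes the method white-box.

```python
# Toy, hypothetical sketch of proposition-level sentence similarity.
# The abstract does not specify PropNet's construction; this only
# illustrates why proposition matching is inherently interpretable.

def prop_sim(p, q):
    """Crude similarity between two (subject, predicate, object)
    triples: the fraction of slots that match exactly."""
    return sum(a == b for a, b in zip(p, q)) / 3.0

def sentence_sim(props_a, props_b):
    """Align each proposition with its best match on the other side
    and average symmetrically. Every score contribution traces back
    to a specific proposition pair -- a white-box decision path."""
    def one_way(src, tgt):
        return sum(max(prop_sim(p, q) for q in tgt) for p in src) / len(src)
    return 0.5 * (one_way(props_a, props_b) + one_way(props_b, props_a))

# Pre-extracted propositions for two sentences (extraction step omitted).
a = [("dog", "chases", "cat"), ("cat", "climbs", "tree")]
b = [("dog", "chases", "cat"), ("cat", "runs", "away")]
print(sentence_sim(a, b))  # 0.667: one exact match, one partial match
```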


Language Models Represent Space and Time

Gurnee, Wes, Tegmark, Max

arXiv.org Artificial Intelligence

The capabilities of large language models (LLMs) have sparked debate over whether such systems just learn an enormous collection of superficial statistics or a coherent model of the data generation process -- a world model. We find preliminary evidence for the latter by analyzing the learned representations of three spatial datasets (world, US, and NYC places) and three temporal datasets (historical figures, artworks, news headlines) in the Llama-2 family of models. We discover that LLMs learn linear representations of space and time across multiple scales. These representations are robust to prompting variations and unified across different entity types (e.g., cities and landmarks). In addition, we identify individual "space neurons" and "time neurons" that reliably encode spatial and temporal coordinates. While further investigation is needed, our results suggest modern LLMs learn rich spatiotemporal representations of the real world and possess basic ingredients of a world model.
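
The analysis technique here is essentially linear probing: fitting a linear map from a layer's hidden activations to real-world coordinates. Below is a minimal sketch, assuming a Llama-2 checkpoint loaded via Hugging Face transformers; the layer index, probe choice, and tiny place list are illustrative assumptions, not the authors' exact pipeline.

```python
# Sketch of a linear probe on LLM activations: fit a linear map from
# hidden states to geographic coordinates. Illustrative assumptions
# throughout -- not the paper's exact experimental setup.
import numpy as np
import torch
from sklearn.linear_model import Ridge
from transformers import AutoModel, AutoTokenizer

MODEL = "meta-llama/Llama-2-7b-hf"  # any LM with accessible hidden states works
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL)
model.eval()

# Toy spatial dataset: entity name -> (latitude, longitude).
places = [
    ("Paris", (48.86, 2.35)),
    ("Tokyo", (35.68, 139.69)),
    ("Cairo", (30.04, 31.24)),
    ("Sydney", (-33.87, 151.21)),
]

def activation(text: str, layer: int = 20) -> np.ndarray:
    """Hidden state of the last token at one intermediate layer."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    return out.hidden_states[layer][0, -1].float().numpy()

X = np.stack([activation(name) for name, _ in places])
y = np.array([coords for _, coords in places])

# A linear probe: if a linear map recovers coordinates, the
# representation encodes space linearly. (A real experiment needs
# thousands of entities and a held-out test split.)
probe = Ridge(alpha=1.0).fit(X, y)
for (name, true), pred in zip(places, probe.predict(X)):
    print(f"{name}: true={true} predicted={tuple(round(v, 2) for v in pred)}")
```

If such a probe generalizes to held-out entities, the coordinates are linearly decodable from the activations, which is what the abstract means by linear representations of space.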


Amazon Web Services AI exec: How cloud computing is driving artificial intelligence breakthroughs

#artificialintelligence

Artificial intelligence research is still in its infancy, at least compared with computer science as a whole, but access to effectively unlimited computing resources is accelerating the field. Swami Sivasubramanian, vice president of AI at Amazon Web Services, has nearly unlimited computing resources at his disposal and is watching this play out firsthand. Last week Sivasubramanian walked GeekWire Cloud Tech Summit attendees through the array of artificial intelligence and machine-learning services his team has developed for AWS customers as well as Amazon's own internal services. If you've been through a few tech cycles, you've already heard a lot about artificial intelligence. Much has been promised from this research field over several decades, but the enormous amount of data now moving into cloud computing services like AWS allows researchers like Sivasubramanian to make real breakthroughs that weren't possible when data sets were scattered and siloed.